Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 333 |
| Missing cells | 620 |
| Missing cells (%) | 10.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 44.7 KiB |
| Average record size in memory | 137.4 B |
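The missing-cell percentage can be checked from the counts in the table above. This is a minimal sketch, assuming the percentage is the number of missing cells divided by variables times observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 18
n_observations = 333
missing_cells = 620

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```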
Variable types
| Numeric | 13 |
|---|---|
| Text | 4 |
| Boolean | 1 |
Alerts
| commercial_property is highly overall correlated with crime_rate and 8 other fields | High correlation |
|---|---|
| competitor_density is highly overall correlated with is_test | High correlation |
| crime_rate is highly overall correlated with commercial_property and 8 other fields | High correlation |
| household_affluency is highly overall correlated with commercial_property and 8 other fields | High correlation |
| household_size is highly overall correlated with household_affluency and 2 other fields | High correlation |
| is_test is highly overall correlated with commercial_property and 4 other fields | High correlation |
| normalised_sales is highly overall correlated with commercial_property and 7 other fields | High correlation |
| property_value is highly overall correlated with commercial_property and 6 other fields | High correlation |
| proportion_flats is highly overall correlated with commercial_property and 4 other fields | High correlation |
| proportion_newbuilds is highly overall correlated with commercial_property and 7 other fields | High correlation |
| proportion_nonretail is highly overall correlated with commercial_property and 8 other fields | High correlation |
| public_transport_dist is highly overall correlated with commercial_property and 6 other fields | High correlation |
| school_proximity is highly overall correlated with normalised_sales and 1 other field | High correlation |
| is_test is highly imbalanced (76.2%) | Imbalance |
| location_id has 13 (3.9%) missing values | Missing |
| crime_rate has 13 (3.9%) missing values | Missing |
| proportion_flats has 13 (3.9%) missing values | Missing |
| proportion_nonretail has 13 (3.9%) missing values | Missing |
| new_store has 13 (3.9%) missing values | Missing |
| commercial_property has 42 (12.6%) missing values | Missing |
| household_size has 13 (3.9%) missing values | Missing |
| proportion_newbuilds has 13 (3.9%) missing values | Missing |
| public_transport_dist has 13 (3.9%) missing values | Missing |
| transport_availability has 13 (3.9%) missing values | Missing |
| property_value has 13 (3.9%) missing values | Missing |
| school_proximity has 76 (22.8%) missing values | Missing |
| competitor_density has 13 (3.9%) missing values | Missing |
| household_affluency has 13 (3.9%) missing values | Missing |
| normalised_sales has 13 (3.9%) missing values | Missing |
| county has 13 (3.9%) missing values | Missing |
| location_id,crime_rate,proportion_flats,proportion_nonretail,new_store,commercial_property,household_size,proportion_newbuilds,public_transport_dist,transport_availability,property_value,school_proximity,competitor_density,household_affluency,county has 320 (96.1%) missing values | Missing |
| proportion_flats has 238 (71.5%) zeros | Zeros |
| proportion_newbuilds has 23 (6.9%) zeros | Zeros |
Reproduction
| Analysis started | 2024-02-09 10:50:04.687750 |
|---|---|
| Analysis finished | 2024-02-09 10:50:18.567780 |
| Duration | 13.88 seconds |
| Software version | ydata-profiling v4.6.4 |
| Download configuration | config.json |
location_id
Real number (ℝ)
MISSING 
| Distinct | 320 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 252.3875 |
| Minimum | 1 |
| Maximum | 506 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 26.95 |
| Q1 | 126.5 |
| median | 251.5 |
| Q3 | 377.25 |
| 95-th percentile | 474.05 |
| Maximum | 506 |
| Range | 505 |
| Interquartile range (IQR) | 250.75 |
Descriptive statistics
| Standard deviation | 145.60058 |
|---|---|
| Coefficient of variation (CV) | 0.576893 |
| Kurtosis | -1.1937793 |
| Mean | 252.3875 |
| Median Absolute Deviation (MAD) | 126 |
| Skewness | -0.021020651 |
| Sum | 80764 |
| Variance | 21199.53 |
| Monotonicity | Not monotonic |
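Several of the figures above are derived from others in the same tables. The sketch below recomputes them for location_id using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). It only checks the printed numbers against each other and does not reproduce the report's internals.

```python
# Values copied from the location_id tables.
minimum, maximum = 1, 506
q1, q3 = 126.5, 377.25
mean, std = 252.3875, 145.60058

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

The same relations hold for the other numeric variables. For normalised_sales the mean is close to zero, which is why its CV is so large in magnitude.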
| Value | Count | Frequency (%) |
| 296 | 1 | 0.3% |
| 315 | 1 | 0.3% |
| 23 | 1 | 0.3% |
| 415 | 1 | 0.3% |
| 56 | 1 | 0.3% |
| 195 | 1 | 0.3% |
| 351 | 1 | 0.3% |
| 111 | 1 | 0.3% |
| 328 | 1 | 0.3% |
| 392 | 1 | 0.3% |
| Other values (310) | 310 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 7 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 13 | 1 | |
| 15 | 1 | |
| 17 | 1 |
| Value | Count | Frequency (%) |
| 506 | 1 | |
| 504 | 1 | |
| 503 | 1 | |
| 501 | 1 | |
| 500 | 1 | |
| 498 | 1 | |
| 494 | 1 | |
| 491 | 1 | |
| 489 | 1 | |
| 488 | 1 |
crime_rate
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 319 |
|---|---|
| Distinct (%) | 99.7% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.5963745 |
| Minimum | 0.0071416 |
| Maximum | 51.693093 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0.0071416 |
|---|---|
| 5-th percentile | 0.028051685 |
| Q1 | 0.0879366 |
| median | 0.28968115 |
| Q3 | 4.0635534 |
| 95-th percentile | 17.156496 |
| Maximum | 51.693093 |
| Range | 51.685951 |
| Interquartile range (IQR) | 3.9756168 |
Descriptive statistics
| Standard deviation | 7.1763415 |
|---|---|
| Coefficient of variation (CV) | 1.9954377 |
| Kurtosis | 13.231406 |
| Mean | 3.5963745 |
| Median Absolute Deviation (MAD) | 0.2520239 |
| Skewness | 3.2555291 |
| Sum | 1150.8399 |
| Variance | 51.499878 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.0169613 | 2 | 0.6% |
| 2.614707 | 1 | 0.3% |
| 20.435598 | 1 | 0.3% |
| 0.7208948 | 1 | 0.3% |
| 0.0494827 | 1 | 0.3% |
| 7.3887988 | 1 | 0.3% |
| 0.1486854 | 1 | 0.3% |
| 0.2419217 | 1 | 0.3% |
| 9.3203417 | 1 | 0.3% |
| 0.2770986 | 1 | 0.3% |
| Other values (309) | 309 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0.0071416 | 1 | |
| 0.0123848 | 1 | |
| 0.0147013 | 1 | |
| 0.0148143 | 1 | |
| 0.015368 | 1 | |
| 0.0161816 | 1 | |
| 0.0162607 | 1 | |
| 0.0169613 | 2 | |
| 0.0200914 | 1 | |
| 0.021131 | 1 |
| Value | Count | Frequency (%) |
| 51.693093 | 1 | |
| 43.337534 | 1 | |
| 42.557947 | 1 | |
| 32.381054 | 1 | |
| 29.312878 | 1 | |
| 28.302093 | 1 | |
| 28.025921 | 1 | |
| 27.564994 | 1 | |
| 25.534723 | 1 | |
| 24.917743 | 1 |
proportion_flats
Real number (ℝ)
HIGH CORRELATION  MISSING  ZEROS 
| Distinct | 25 |
|---|---|
| Distinct (%) | 7.8% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.673438 |
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 238 |
| Zeros (%) | 71.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 12.5 |
| 95-th percentile | 75.25 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 12.5 |
Descriptive statistics
| Standard deviation | 22.579232 |
|---|---|
| Coefficient of variation (CV) | 2.1154602 |
| Kurtosis | 4.8618828 |
| Mean | 10.673438 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.3679581 |
| Sum | 3415.5 |
| Variance | 509.82171 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 238 | |
| 20 | 13 | 3.9% |
| 80 | 7 | 2.1% |
| 22 | 7 | 2.1% |
| 25 | 7 | 2.1% |
| 12.5 | 6 | 1.8% |
| 45 | 5 | 1.5% |
| 75 | 3 | 0.9% |
| 55 | 3 | 0.9% |
| 21 | 3 | 0.9% |
| Other values (15) | 28 | 8.4% |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0 | 238 | |
| 12.5 | 6 | 1.8% |
| 17.5 | 1 | 0.3% |
| 18 | 1 | 0.3% |
| 20 | 13 | 3.9% |
| 21 | 3 | 0.9% |
| 22 | 7 | 2.1% |
| 25 | 7 | 2.1% |
| 28 | 2 | 0.6% |
| 30 | 3 | 0.9% |
| Value | Count | Frequency (%) |
| 100 | 1 | 0.3% |
| 95 | 3 | |
| 90 | 2 | 0.6% |
| 85 | 2 | 0.6% |
| 82.5 | 1 | 0.3% |
| 80 | 7 | |
| 75 | 3 | |
| 60 | 3 | |
| 55 | 3 | |
| 52.5 | 1 | 0.3% |
proportion_nonretail
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 67 |
|---|---|
| Distinct (%) | 20.9% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.307906 |
| Minimum | 0.74 |
| Maximum | 27.74 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0.74 |
|---|---|
| 5-th percentile | 2.18 |
| Q1 | 5.13 |
| median | 9.9 |
| Q3 | 18.1 |
| 95-th percentile | 21.89 |
| Maximum | 27.74 |
| Range | 27 |
| Interquartile range (IQR) | 12.97 |
Descriptive statistics
| Standard deviation | 7.0326934 |
|---|---|
| Coefficient of variation (CV) | 0.6219271 |
| Kurtosis | -1.2360987 |
| Mean | 11.307906 |
| Median Absolute Deviation (MAD) | 6.615 |
| Skewness | 0.28816142 |
| Sum | 3618.53 |
| Variance | 49.458776 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 18.1 | 84 | |
| 19.58 | 20 | 6.0% |
| 6.2 | 13 | 3.9% |
| 8.14 | 12 | 3.6% |
| 21.89 | 10 | 3.0% |
| 9.9 | 9 | 2.7% |
| 3.97 | 7 | 2.1% |
| 4.05 | 7 | 2.1% |
| 5.86 | 7 | 2.1% |
| 8.56 | 7 | 2.1% |
| Other values (57) | 144 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0.74 | 1 | |
| 1.21 | 1 | |
| 1.22 | 1 | |
| 1.25 | 1 | |
| 1.32 | 1 | |
| 1.38 | 1 | |
| 1.47 | 1 | |
| 1.52 | 2 | |
| 1.69 | 1 | |
| 1.76 | 1 |
| Value | Count | Frequency (%) |
| 27.74 | 4 | 1.2% |
| 25.65 | 6 | 1.8% |
| 21.89 | 10 | 3.0% |
| 19.58 | 20 | 6.0% |
| 18.1 | 84 | |
| 15.04 | 2 | 0.6% |
| 13.92 | 3 | 0.9% |
| 13.89 | 1 | 0.3% |
| 12.83 | 4 | 1.2% |
| 11.93 | 4 | 1.2% |
new_store
Text
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Memory size | 2.7 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 2.059375 |
| Min length | 2 |
Characters and Unicode
| Total characters | 659 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | no |
|---|---|
| 2nd row | no |
| 3rd row | no |
| 4th row | no |
| 5th row | no |
| Value | Count | Frequency (%) |
| no | 301 | |
| yes | 19 | 5.9% |
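The length and character figures for new_store follow directly from the two value counts. A minimal sketch, assuming lengths are plain character counts (the names below are illustrative):

```python
# Value counts from the new_store table.
counts = {"no": 301, "yes": 19}

n_values = sum(counts.values())                                    # non-missing rows
total_chars = sum(len(value) * n for value, n in counts.items())   # Total characters
mean_length = total_chars / n_values                               # Mean length
distinct_chars = len(set("".join(counts)))                         # Distinct characters

print(total_chars, mean_length, distinct_chars)
```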
Most occurring characters
| Value | Count | Frequency (%) |
| n | 301 | |
| o | 301 | |
| y | 19 | 2.9% |
| e | 19 | 2.9% |
| s | 19 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 659 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 301 | |
| o | 301 | |
| y | 19 | 2.9% |
| e | 19 | 2.9% |
| s | 19 | 2.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 659 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 301 | |
| o | 301 | |
| y | 19 | 2.9% |
| e | 19 | 2.9% |
| s | 19 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 659 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 301 | |
| o | 301 | |
| y | 19 | 2.9% |
| e | 19 | 2.9% |
| s | 19 | 2.9% |
commercial_property
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 76 |
|---|---|
| Distinct (%) | 26.1% |
| Missing | 42 |
| Missing (%) | 12.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.868557 |
| Minimum | 1.75 |
| Maximum | 1009 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 1.75 |
|---|---|
| 5-th percentile | 3.1 |
| Q1 | 5.45 |
| median | 9.4 |
| Q3 | 14.05 |
| 95-th percentile | 21 |
| Maximum | 1009 |
| Range | 1007.25 |
| Interquartile range (IQR) | 8.6 |
Descriptive statistics
| Standard deviation | 73.806051 |
|---|---|
| Coefficient of variation (CV) | 4.3753625 |
| Kurtosis | 149.49455 |
| Mean | 16.868557 |
| Median Absolute Deviation (MAD) | 4.3 |
| Skewness | 12.087634 |
| Sum | 4908.75 |
| Variance | 5447.3332 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 18.15 | 11 | 3.3% |
| 9.4 | 11 | 3.3% |
| 13.7 | 10 | 3.0% |
| 26.05 | 10 | 3.0% |
| 4.35 | 10 | 3.0% |
| 12.75 | 9 | 2.7% |
| 6.95 | 9 | 2.7% |
| 9.7 | 8 | 2.4% |
| 4.05 | 7 | 2.1% |
| 17.5 | 7 | 2.1% |
| Other values (66) | 199 | |
| (Missing) | 42 | 12.6% |
| Value | Count | Frequency (%) |
| 1.75 | 1 | 0.3% |
| 1.95 | 1 | 0.3% |
| 2.55 | 3 | |
| 2.65 | 1 | 0.3% |
| 2.7 | 1 | 0.3% |
| 2.75 | 1 | 0.3% |
| 2.95 | 2 | |
| 3 | 1 | 0.3% |
| 3.05 | 4 | |
| 3.15 | 2 |
| Value | Count | Frequency (%) |
| 1009 | 1 | 0.3% |
| 767 | 1 | 0.3% |
| 123 | 1 | 0.3% |
| 26.05 | 10 | |
| 21 | 5 | |
| 19.5 | 7 | |
| 18.4 | 3 | 0.9% |
| 18.15 | 11 | |
| 17.5 | 7 | |
| 17.15 | 6 |
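The three largest commercial_property values (1009, 767, 123) sit far above the 95th percentile of 21 and account for the high skewness and kurtosis. The report itself does not flag outliers. As an illustration only, the sketch below applies Tukey's 1.5 x IQR rule, a common convention introduced here and not something the report uses, to the quartiles printed above.

```python
# Quartiles and largest values copied from the commercial_property tables.
q1, q3 = 5.45, 14.05
largest_values = [1009, 767, 123, 26.05, 21, 19.5]

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr   # Tukey's upper fence
lower_fence = q1 - 1.5 * iqr   # Tukey's lower fence

# Keep only the values that exceed the upper fence.
outliers = [v for v in largest_values if v > upper_fence]
print(f"fences: [{lower_fence:.2f}, {upper_fence:.2f}], outliers: {outliers}")
```

Under this rule 26.05 falls just inside the upper fence, so only the three extreme values would be candidates for review.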
household_size
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 298 |
|---|---|
| Distinct (%) | 93.1% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.2528031 |
| Minimum | 0.561 |
| Maximum | 5.725 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0.561 |
|---|---|
| 5-th percentile | 2.18445 |
| Q1 | 2.87975 |
| median | 3.1975 |
| Q3 | 3.59725 |
| 95-th percentile | 4.47095 |
| Maximum | 5.725 |
| Range | 5.164 |
| Interquartile range (IQR) | 0.7175 |
Descriptive statistics
| Standard deviation | 0.69544191 |
|---|---|
| Coefficient of variation (CV) | 0.21379772 |
| Kurtosis | 1.9956769 |
| Mean | 3.2528031 |
| Median Absolute Deviation (MAD) | 0.334 |
| Skewness | 0.18421453 |
| Sum | 1040.897 |
| Variance | 0.48363944 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3.229 | 3 | 0.9% |
| 3.03 | 2 | 0.6% |
| 3.127 | 2 | 0.6% |
| 3.122 | 2 | 0.6% |
| 3.211 | 2 | 0.6% |
| 3.417 | 2 | 0.6% |
| 3.185 | 2 | 0.6% |
| 2.304 | 2 | 0.6% |
| 2.713 | 2 | 0.6% |
| 2.888 | 2 | 0.6% |
| Other values (288) | 299 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0.561 | 1 | |
| 0.863 | 1 | |
| 1.138 | 1 | |
| 1.368 | 1 | |
| 1.519 | 1 | |
| 1.652 | 1 | |
| 1.906 | 1 | |
| 1.926 | 1 | |
| 1.963 | 1 | |
| 1.97 | 1 |
| Value | Count | Frequency (%) |
| 5.725 | 1 | |
| 5.375 | 1 | |
| 5.266 | 1 | |
| 5.259 | 1 | |
| 5.247 | 1 | |
| 5.04 | 1 | |
| 5.034 | 1 | |
| 4.929 | 1 | |
| 4.923 | 1 | |
| 4.853 | 1 |
proportion_newbuilds
Real number (ℝ)
HIGH CORRELATION  MISSING  ZEROS 
| Distinct | 252 |
|---|---|
| Distinct (%) | 78.8% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.849062 |
| Minimum | 0 |
| Maximum | 94 |
| Zeros | 23 |
| Zeros (%) | 6.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 6.35 |
| median | 23.4 |
| Q3 | 54.45 |
| 95-th percentile | 82.205 |
| Maximum | 94 |
| Range | 94 |
| Interquartile range (IQR) | 48.1 |
Descriptive statistics
| Standard deviation | 27.845777 |
|---|---|
| Coefficient of variation (CV) | 0.87430443 |
| Kurtosis | -0.94057168 |
| Mean | 31.849062 |
| Median Absolute Deviation (MAD) | 19.95 |
| Skewness | 0.58915328 |
| Sum | 10191.7 |
| Variance | 775.38727 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 23 | 6.9% |
| 4 | 4 | 1.2% |
| 4.6 | 3 | 0.9% |
| 23.5 | 3 | 0.9% |
| 1.2 | 3 | 0.9% |
| 4.4 | 3 | 0.9% |
| 78.6 | 3 | 0.9% |
| 2.7 | 3 | 0.9% |
| 20.1 | 2 | 0.6% |
| 29.6 | 2 | 0.6% |
| Other values (242) | 271 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0 | 23 | |
| 0.7 | 1 | 0.3% |
| 1.1 | 1 | 0.3% |
| 1.2 | 3 | 0.9% |
| 1.3 | 1 | 0.3% |
| 1.5 | 1 | 0.3% |
| 1.6 | 2 | 0.6% |
| 1.8 | 2 | 0.6% |
| 1.9 | 1 | 0.3% |
| 2 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 94 | 1 | |
| 93.8 | 1 | |
| 93.5 | 1 | |
| 93.4 | 1 | |
| 92.2 | 2 | |
| 91.6 | 1 | |
| 91.1 | 1 | |
| 90.2 | 1 | |
| 90.1 | 1 | |
| 87 | 1 |
public_transport_dist
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 286 |
|---|---|
| Distinct (%) | 89.4% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.718765 |
| Minimum | 1.137 |
| Maximum | 10.7103 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 1.137 |
|---|---|
| 5-th percentile | 1.50955 |
| Q1 | 2.138075 |
| median | 3.09575 |
| Q3 | 5.1167 |
| 95-th percentile | 7.66147 |
| Maximum | 10.7103 |
| Range | 9.5733 |
| Interquartile range (IQR) | 2.978625 |
Descriptive statistics
| Standard deviation | 1.9847652 |
|---|---|
| Coefficient of variation (CV) | 0.53371623 |
| Kurtosis | 0.14082632 |
| Mean | 3.718765 |
| Median Absolute Deviation (MAD) | 1.1629 |
| Skewness | 0.94971955 |
| Sum | 1190.0048 |
| Variance | 3.9392931 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.8122 | 3 | 0.9% |
| 3.6519 | 3 | 0.9% |
| 5.2873 | 3 | 0.9% |
| 5.4007 | 3 | 0.9% |
| 6.4798 | 3 | 0.9% |
| 6.8147 | 3 | 0.9% |
| 6.0622 | 2 | 0.6% |
| 3.2721 | 2 | 0.6% |
| 7.3967 | 2 | 0.6% |
| 6.4584 | 2 | 0.6% |
| Other values (276) | 294 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 1.137 | 1 | |
| 1.1691 | 1 | |
| 1.1742 | 1 | |
| 1.2024 | 1 | |
| 1.2852 | 1 | |
| 1.3216 | 1 | |
| 1.3325 | 1 | |
| 1.3449 | 1 | |
| 1.358 | 1 | |
| 1.4191 | 1 |
| Value | Count | Frequency (%) |
| 10.7103 | 1 | |
| 9.2229 | 1 | |
| 9.1876 | 1 | |
| 9.0892 | 1 | |
| 8.9067 | 1 | |
| 8.7921 | 1 | |
| 8.6966 | 1 | |
| 8.5353 | 1 | |
| 8.344 | 1 | |
| 8.3248 | 1 |
transport_availability
Text
MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Memory size | 2.7 KiB |
Length
| Max length | 25 |
|---|---|
| Median length | 22 |
| Mean length | 21.865625 |
| Min length | 20 |
Characters and Unicode
| Total characters | 6997 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | All transport options |
|---|---|
| 2nd row | Average transport options |
| 3rd row | Many transport options |
| 4th row | No transport options |
| 5th row | Average transport options |
| Value | Count | Frequency (%) |
| transport | 320 | |
| options | 320 | |
| all | 84 | 8.8% |
| average | 72 | 7.5% |
| few | 69 | 7.2% |
| no | 53 | 5.5% |
| many | 42 | 4.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 1013 | |
| t | 960 | |
| r | 712 | |
| n | 682 | |
| (space) | 640 | |
| s | 640 | |
| p | 640 | |
| a | 434 | |
| i | 320 | 4.6% |
| e | 213 | 3.0% |
| Other values (9) | 743 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6037 | |
| Space Separator | 640 | 9.1% |
| Uppercase Letter | 320 | 4.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 1013 | |
| t | 960 | |
| r | 712 | |
| n | 682 | |
| s | 640 | |
| p | 640 | |
| a | 434 | |
| i | 320 | 5.3% |
| e | 213 | 3.5% |
| l | 168 | 2.8% |
| Other values (4) | 255 | 4.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 156 | |
| F | 69 | |
| N | 53 | 16.6% |
| M | 42 | 13.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 640 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6357 | |
| Common | 640 | 9.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 1013 | |
| t | 960 | |
| r | 712 | |
| n | 682 | |
| s | 640 | |
| p | 640 | |
| a | 434 | |
| i | 320 | 5.0% |
| e | 213 | 3.4% |
| l | 168 | 2.6% |
| Other values (8) | 575 |
Common
| Value | Count | Frequency (%) |
| (space) | 640 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6997 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 1013 | |
| t | 960 | |
| r | 712 | |
| n | 682 | |
| (space) | 640 | |
| s | 640 | |
| p | 640 | |
| a | 434 | |
| i | 320 | 4.6% |
| e | 213 | 3.0% |
| Other values (9) | 743 |
property_value
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 58 |
|---|---|
| Distinct (%) | 18.1% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 408.83438 |
| Minimum | 188 |
| Maximum | 711 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 188 |
|---|---|
| 5-th percentile | 216 |
| Q1 | 277 |
| median | 330 |
| Q3 | 666 |
| 95-th percentile | 666 |
| Maximum | 711 |
| Range | 523 |
| Interquartile range (IQR) | 389 |
Descriptive statistics
| Standard deviation | 170.88897 |
|---|---|
| Coefficient of variation (CV) | 0.41799072 |
| Kurtosis | -1.1841982 |
| Mean | 408.83438 |
| Median Absolute Deviation (MAD) | 79.5 |
| Skewness | 0.63298263 |
| Sum | 130827 |
| Variance | 29203.041 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 666 | 84 | |
| 307 | 25 | 7.5% |
| 403 | 20 | 6.0% |
| 437 | 10 | 3.0% |
| 304 | 9 | 2.7% |
| 398 | 9 | 2.7% |
| 224 | 8 | 2.4% |
| 296 | 8 | 2.4% |
| 384 | 7 | 2.1% |
| 264 | 7 | 2.1% |
| Other values (48) | 133 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 188 | 6 | |
| 193 | 6 | |
| 198 | 1 | 0.3% |
| 216 | 4 | |
| 222 | 5 | |
| 223 | 3 | 0.9% |
| 224 | 8 | |
| 226 | 1 | 0.3% |
| 233 | 6 | |
| 241 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 711 | 4 | 1.2% |
| 666 | 84 | |
| 469 | 1 | 0.3% |
| 437 | 10 | 3.0% |
| 432 | 6 | 1.8% |
| 430 | 2 | 0.6% |
| 422 | 1 | 0.3% |
| 411 | 1 | 0.3% |
| 403 | 20 | 6.0% |
| 402 | 1 | 0.3% |
school_proximity
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 40 |
|---|---|
| Distinct (%) | 15.6% |
| Missing | 76 |
| Missing (%) | 22.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.589494 |
| Minimum | 13 |
| Maximum | 21.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 14.7 |
| Q1 | 17.4 |
| median | 19.1 |
| Q3 | 20.2 |
| 95-th percentile | 21 |
| Maximum | 21.2 |
| Range | 8.2 |
| Interquartile range (IQR) | 2.8 |
Descriptive statistics
| Standard deviation | 2.0755286 |
|---|---|
| Coefficient of variation (CV) | 0.11165062 |
| Kurtosis | -0.12017046 |
| Mean | 18.589494 |
| Median Absolute Deviation (MAD) | 1.1 |
| Skewness | -0.87250548 |
| Sum | 4777.5 |
| Variance | 4.3078189 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20.2 | 77 | |
| 14.7 | 14 | 4.2% |
| 21 | 13 | 3.9% |
| 18.4 | 11 | 3.3% |
| 19.1 | 10 | 3.0% |
| 17.8 | 10 | 3.0% |
| 18.6 | 9 | 2.7% |
| 17.4 | 9 | 2.7% |
| 21.2 | 9 | 2.7% |
| 16.6 | 9 | 2.7% |
| Other values (30) | 86 | |
| (Missing) | 76 |
| Value | Count | Frequency (%) |
| 13 | 6 | |
| 13.6 | 1 | 0.3% |
| 14.7 | 14 | |
| 14.9 | 1 | 0.3% |
| 15.1 | 1 | 0.3% |
| 15.2 | 7 | |
| 15.3 | 2 | 0.6% |
| 15.6 | 2 | 0.6% |
| 15.9 | 1 | 0.3% |
| 16 | 3 | 0.9% |
| Value | Count | Frequency (%) |
| 21.2 | 9 | 2.7% |
| 21.1 | 1 | 0.3% |
| 21 | 13 | 3.9% |
| 20.9 | 6 | 1.8% |
| 20.2 | 77 | |
| 20.1 | 3 | 0.9% |
| 19.7 | 4 | 1.2% |
| 19.6 | 3 | 0.9% |
| 19.2 | 7 | 2.1% |
| 19.1 | 10 | 3.0% |
competitor_density
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 226 |
|---|---|
| Distinct (%) | 70.6% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 359.65756 |
| Minimum | 3.5 |
| Maximum | 396.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 3.5 |
|---|---|
| 5-th percentile | 97.889 |
| Q1 | 376.7225 |
| median | 392.205 |
| Q3 | 396.3525 |
| 95-th percentile | 396.9 |
| Maximum | 396.9 |
| Range | 393.4 |
| Interquartile range (IQR) | 19.63 |
Descriptive statistics
| Standard deviation | 86.048632 |
|---|---|
| Coefficient of variation (CV) | 0.23925156 |
| Kurtosis | 7.9850234 |
| Mean | 359.65756 |
| Median Absolute Deviation (MAD) | 4.695 |
| Skewness | -2.9883666 |
| Sum | 115090.42 |
| Variance | 7404.3671 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 396.9 | 79 | 23.7% |
| 395.24 | 3 | 0.9% |
| 396.21 | 2 | 0.6% |
| 377.07 | 2 | 0.6% |
| 393.37 | 2 | 0.6% |
| 395.11 | 2 | 0.6% |
| 395.56 | 2 | 0.6% |
| 395.63 | 2 | 0.6% |
| 393.23 | 2 | 0.6% |
| 389.71 | 2 | 0.6% |
| Other values (216) | 222 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 3.5 | 1 | |
| 3.65 | 1 | |
| 7.68 | 1 | |
| 9.32 | 1 | |
| 18.82 | 1 | |
| 22.01 | 1 | |
| 27.25 | 1 | |
| 43.06 | 1 | |
| 48.45 | 1 | |
| 50.92 | 1 |
| Value | Count | Frequency (%) |
| 396.9 | 79 | |
| 396.42 | 1 | 0.3% |
| 396.33 | 1 | 0.3% |
| 396.3 | 1 | 0.3% |
| 396.28 | 1 | 0.3% |
| 396.24 | 1 | 0.3% |
| 396.21 | 2 | 0.6% |
| 396.14 | 1 | 0.3% |
| 396.06 | 1 | 0.3% |
| 395.99 | 1 | 0.3% |
household_affluency
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 298 |
|---|---|
| Distinct (%) | 93.1% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.1440078 |
| Minimum | 0.4325 |
| Maximum | 9.4925 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | 0.4325 |
|---|---|
| 5-th percentile | 0.923625 |
| Q1 | 1.80375 |
| median | 2.80875 |
| Q3 | 4.091875 |
| 95-th percentile | 6.693125 |
| Maximum | 9.4925 |
| Range | 9.06 |
| Interquartile range (IQR) | 2.288125 |
Descriptive statistics
| Standard deviation | 1.7740414 |
|---|---|
| Coefficient of variation (CV) | 0.56426114 |
| Kurtosis | 0.77003274 |
| Mean | 3.1440078 |
| Median Absolute Deviation (MAD) | 1.125 |
| Skewness | 0.98958754 |
| Sum | 1006.0825 |
| Variance | 3.1472231 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.5325 | 3 | 0.9% |
| 5.995 | 2 | 0.6% |
| 1.1125 | 2 | 0.6% |
| 1.9 | 2 | 0.6% |
| 1.42 | 2 | 0.6% |
| 1.645 | 2 | 0.6% |
| 2.025 | 2 | 0.6% |
| 1.375 | 2 | 0.6% |
| 1.9475 | 2 | 0.6% |
| 1.3325 | 2 | 0.6% |
| Other values (288) | 299 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| 0.4325 | 1 | |
| 0.495 | 1 | |
| 0.7175 | 1 | |
| 0.72 | 1 | |
| 0.735 | 1 | |
| 0.74 | 1 | |
| 0.7525 | 1 | |
| 0.7825 | 1 | |
| 0.79 | 2 | |
| 0.815 | 1 |
| Value | Count | Frequency (%) |
| 9.4925 | 1 | |
| 9.245 | 1 | |
| 8.6925 | 1 | |
| 8.6025 | 1 | |
| 7.9975 | 1 | |
| 7.6575 | 1 | |
| 7.6475 | 1 | |
| 7.42 | 1 | |
| 7.3875 | 1 | |
| 7.3825 | 1 |
normalised_sales
Real number (ℝ)
HIGH CORRELATION  MISSING 
| Distinct | 188 |
|---|---|
| Distinct (%) | 58.8% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.016966731 |
| Minimum | -1.936974 |
| Maximum | 2.9684773 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 187 |
| Negative (%) | 56.2% |
| Memory size | 2.7 KiB |
Quantile statistics
| Minimum | -1.936974 |
|---|---|
| 5-th percentile | -1.3374188 |
| Q1 | -0.58524963 |
| median | -0.14375902 |
| Q3 | 0.24322658 |
| 95-th percentile | 2.1852402 |
| Maximum | 2.9684773 |
| Range | 4.9054512 |
| Interquartile range (IQR) | 0.82847621 |
Descriptive statistics
| Standard deviation | 0.97856136 |
|---|---|
| Coefficient of variation (CV) | -57.675302 |
| Kurtosis | 1.6704039 |
| Mean | -0.016966731 |
| Median Absolute Deviation (MAD) | 0.4033371 |
| Skewness | 1.1175046 |
| Sum | -5.4293541 |
| Variance | 0.95758233 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.96847726 | 10 | 3.0% |
| -0.2364175427 | 5 | 1.5% |
| -0.3781305781 | 5 | 1.5% |
| -0.3672295754 | 5 | 1.5% |
| 0.1233155474 | 5 | 1.5% |
| -0.5416456191 | 5 | 1.5% |
| 0.03610752556 | 5 | 1.5% |
| 0.2432265774 | 5 | 1.5% |
| 0.003404517369 | 4 | 1.2% |
| 0.177820561 | 4 | 1.2% |
| Other values (178) | 267 | |
| (Missing) | 13 | 3.9% |
| Value | Count | Frequency (%) |
| -1.936973968 | 1 | |
| -1.871567952 | 1 | |
| -1.718953914 | 1 | |
| -1.697151908 | 2 | |
| -1.675349903 | 1 | |
| -1.599042884 | 1 | |
| -1.577240878 | 2 | |
| -1.566339876 | 1 | |
| -1.533636867 | 1 | |
| -1.522735865 | 1 |
| Value | Count | Frequency (%) |
| 2.96847726 | 10 | |
| 2.804962219 | 1 | 0.3% |
| 2.783160213 | 1 | 0.3% |
| 2.53243715 | 1 | 0.3% |
| 2.401625118 | 1 | 0.3% |
| 2.259912082 | 1 | 0.3% |
| 2.216308071 | 1 | 0.3% |
| 2.183605063 | 1 | 0.3% |
| 1.856574981 | 1 | 0.3% |
| 1.649455929 | 1 | 0.3% |
county
Text
MISSING 
| Distinct | 98 |
|---|---|
| Distinct (%) | 30.6% |
| Missing | 13 |
| Missing (%) | 3.9% |
| Memory size | 2.7 KiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.046875 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1295 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 9.4% |
Sample
| 1st row | c_40 |
|---|---|
| 2nd row | c_80 |
| 3rd row | c_53 |
| 4th row | c_65 |
| 5th row | c_97 |
| Value | Count | Frequency (%) |
| c_60 | 10 | 3.1% |
| c_61 | 10 | 3.1% |
| c_50 | 10 | 3.1% |
| c_45 | 9 | 2.8% |
| c_72 | 9 | 2.8% |
| c_68 | 8 | 2.5% |
| c_48 | 8 | 2.5% |
| c_39 | 7 | 2.2% |
| c_63 | 7 | 2.2% |
| c_62 | 7 | 2.2% |
| Other values (88) | 235 |
Most occurring characters
| Value | Count | Frequency (%) |
| c | 320 | |
| _ | 320 | |
| 6 | 89 | 6.9% |
| 4 | 82 | 6.3% |
| 5 | 80 | 6.2% |
| 7 | 69 | 5.3% |
| 3 | 67 | 5.2% |
| 2 | 64 | 4.9% |
| 1 | 55 | 4.2% |
| 8 | 54 | 4.2% |
| Other values (2) | 95 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 655 | |
| Lowercase Letter | 320 | |
| Connector Punctuation | 320 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 6 | 89 | |
| 4 | 82 | |
| 5 | 80 | |
| 7 | 69 | |
| 3 | 67 | |
| 2 | 64 | |
| 1 | 55 | |
| 8 | 54 | |
| 9 | 52 | |
| 0 | 43 |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 320 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 320 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 975 | |
| Latin | 320 | 24.7% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| _ | 320 | |
| 6 | 89 | 9.1% |
| 4 | 82 | 8.4% |
| 5 | 80 | 8.2% |
| 7 | 69 | 7.1% |
| 3 | 67 | 6.9% |
| 2 | 64 | 6.6% |
| 1 | 55 | 5.6% |
| 8 | 54 | 5.5% |
| 9 | 52 | 5.3% |
Latin
| Value | Count | Frequency (%) |
| c | 320 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1295 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 320 | |
| _ | 320 | |
| 6 | 89 | 6.9% |
| 4 | 82 | 6.3% |
| 5 | 80 | 6.2% |
| 7 | 69 | 5.3% |
| 3 | 67 | 5.2% |
| 2 | 64 | 4.9% |
| 1 | 55 | 4.2% |
| 8 | 54 | 4.2% |
| Other values (2) | 95 | 7.3% |
is_test
Boolean
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 461.0 B |
| Value | Count | Frequency (%) |
| False | 320 | |
| True | 13 | 3.9% |
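The 76.2% imbalance reported for is_test is consistent with one minus the normalised Shannon entropy of the class counts. That this is how the report derives the figure is an assumption, not something the report states. The sketch below shows that the formula reproduces the number from the counts above.

```python
import math

# Class counts from the is_test table.
counts = [320, 13]
total = sum(counts)

# Shannon entropy of the class distribution, in bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts)

# Normalise by the maximum entropy for this number of classes, then invert,
# so 0 means perfectly balanced and 1 means a single class.
imbalance = 1 - entropy / math.log2(len(counts))

print(f"{100 * imbalance:.1f}%")
```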
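location_id,crime_rate,proportion_flats,proportion_nonretail,new_store,commercial_property,household_size,proportion_newbuilds,public_transport_dist,transport_availability,property_value,school_proximity,competitor_density,household_affluency,county
Text
MISSING 
| Distinct | 13 |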
|---|---|
| Distinct (%) | 100.0% |
| Missing | 320 |
| Missing (%) | 96.1% |
| Memory size | 2.7 KiB |
Length
| Max length | 135 |
|---|---|
| Median length | 131 |
| Mean length | 124.76923 |
| Min length | 100 |
Characters and Unicode
| Total characters | 1622 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 13 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 105,0.03996809999999999,34.0,6.09,no,4.150000000000001,3.59,59.6,5.4917,Many transport options,329,16.1,395.75,2.375,c_42 |
|---|---|
| 2nd row | 400,0.5877581999999999,20.0,3.97,no,14.850000000000001,5.398,8.5,2.2885,Average transport options,264,13.0,386.86,1.4775,c_140 |
| 3rd row | 338,1.1169258999999998,0.0,8.14,no,9.399999999999997,2.8129999999999997,0.0,4.0952,Few transport options,307,,394.54,4.97,c_55 |
| 4th row | 227,1.5174092,0.0,19.58,no,12.75,3.066,0.0,1.7573,Average transport options,403,14.7,353.89,1.6075,c_62 |
| 5th row | 114,83.093533,0.0,18.1,no,16.450000000000003,2.9570000000000007,0.0,1.8026,All transport options,666,20.2,16.45,5.155000000000001,c_22 |
| Value | Count | Frequency (%) |
| transport | 13 | 33.3% |
| 105,0.03996809999999999,34.0,6.09,no,4.150000000000001,3.59,59.6,5.4917,many | 1 | 2.6% |
| options,233,17.9,383.37,1.4525,c_63 | 1 | 2.6% |
| 363,0.1482221,0.0,8.56,no,,3.1270000000000007,14.799999999999997,2.1224,average | 1 | 2.6% |
| options,307,17.4,385.91,0.6175,c_122 | 1 | 2.6% |
| 148,0.6500776999999999,0.0,6.2,no,7.850000000000001,5.337,26.700000000000003,3.8384,many | 1 | 2.6% |
| options,666,,385.09,4.3175,c_56 | 1 | 2.6% |
| 136,5.751892099999999,0.0,18.1,no,18.15,3.2970000000000006,8.200000000000003,2.3682,all | 1 | 2.6% |
| options,287,19.6,393.68,1.27,c_58 | 1 | 2.6% |
| 341,0.2168018,0.0,7.38,no,7.15,3.431,85.3,5.4159,average | 1 | 2.6% |
| Other values (17) | 17 | 43.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 235 | 14.5% |
| 9 | 225 | 13.9% |
| , | 182 | 11.2% |
| . | 127 | 7.8% |
| 1 | 83 | 5.1% |
| 3 | 73 | 4.5% |
| 2 | 70 | 4.3% |
| 5 | 66 | 4.1% |
| 8 | 61 | 3.8% |
| 7 | 60 | 3.7% |
| Other values (23) | 440 | 27.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 971 | 59.9% |
| Other Punctuation | 309 | 19.1% |
| Lowercase Letter | 290 | 17.9% |
| Space Separator | 26 | 1.6% |
| Connector Punctuation | 13 | 0.8% |
| Uppercase Letter | 13 | 0.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 53 | 18.3% |
| n | 40 | 13.8% |
| t | 39 | 13.4% |
| r | 30 | 10.3% |
| s | 27 | 9.3% |
| p | 26 | 9.0% |
| a | 19 | 6.6% |
| i | 13 | 4.5% |
| c | 13 | 4.5% |
| e | 10 | 3.4% |
| Other values (5) | 20 | 6.9% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 235 | 24.2% |
| 9 | 225 | 23.2% |
| 1 | 83 | 8.5% |
| 3 | 73 | 7.5% |
| 2 | 70 | 7.2% |
| 5 | 66 | 6.8% |
| 8 | 61 | 6.3% |
| 7 | 60 | 6.2% |
| 6 | 55 | 5.7% |
| 4 | 43 | 4.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 8 | 61.5% |
| M | 2 | 15.4% |
| N | 2 | 15.4% |
| F | 1 | 7.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 182 | 58.9% |
| . | 127 | 41.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 26 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 13 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1319 | 81.3% |
| Latin | 303 | 18.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 53 | 17.5% |
| n | 40 | 13.2% |
| t | 39 | 12.9% |
| r | 30 | 9.9% |
| s | 27 | 8.9% |
| p | 26 | 8.6% |
| a | 19 | 6.3% |
| i | 13 | 4.3% |
| c | 13 | 4.3% |
| e | 10 | 3.3% |
| Other values (9) | 33 | 10.9% |
Common
| Value | Count | Frequency (%) |
| 0 | 235 | 17.8% |
| 9 | 225 | 17.1% |
| , | 182 | 13.8% |
| . | 127 | 9.6% |
| 1 | 83 | 6.3% |
| 3 | 73 | 5.5% |
| 2 | 70 | 5.3% |
| 5 | 66 | 5.0% |
| 8 | 61 | 4.6% |
| 7 | 60 | 4.5% |
| Other values (4) | 137 | 10.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1622 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 235 | 14.5% |
| 9 | 225 | 13.9% |
| , | 182 | 11.2% |
| . | 127 | 7.8% |
| 1 | 83 | 5.1% |
| 3 | 73 | 4.5% |
| 2 | 70 | 4.3% |
| 5 | 66 | 4.1% |
| 8 | 61 | 3.8% |
| 7 | 60 | 3.7% |
| Other values (23) | 440 | 27.1% |
Correlations
| | commercial_property | competitor_density | crime_rate | household_affluency | household_size | is_test | location_id | normalised_sales | property_value | proportion_flats | proportion_newbuilds | proportion_nonretail | public_transport_dist | school_proximity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| commercial_property | 1.000 | -0.277 | 0.779 | 0.612 | -0.349 | 1.000 | 0.013 | -0.551 | 0.640 | -0.627 | -0.788 | 0.756 | -0.857 | 0.399 |
| competitor_density | -0.277 | 1.000 | -0.328 | -0.202 | 0.049 | 1.000 | 0.042 | 0.145 | -0.287 | 0.157 | 0.210 | -0.285 | 0.226 | -0.072 |
| crime_rate | 0.779 | -0.328 | 1.000 | 0.640 | -0.367 | 1.000 | 0.059 | -0.545 | 0.735 | -0.558 | -0.684 | 0.716 | -0.713 | 0.466 |
| household_affluency | 0.612 | -0.202 | 0.640 | 1.000 | -0.650 | 1.000 | -0.019 | -0.862 | 0.536 | -0.482 | -0.656 | 0.649 | -0.580 | 0.436 |
| household_size | -0.349 | 0.049 | -0.367 | -0.650 | 1.000 | 1.000 | -0.090 | 0.641 | -0.318 | 0.393 | 0.301 | -0.473 | 0.329 | -0.263 |
| is_test | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| location_id | 0.013 | 0.042 | 0.059 | -0.019 | -0.090 | NaN | 1.000 | 0.033 | -0.011 | 0.035 | -0.014 | 0.014 | 0.013 | 0.033 |
| normalised_sales | -0.551 | 0.145 | -0.545 | -0.862 | 0.641 | NaN | 0.033 | 1.000 | -0.546 | 0.442 | 0.562 | -0.586 | 0.456 | -0.527 |
| property_value | 0.640 | -0.287 | 0.735 | 0.536 | -0.318 | NaN | -0.011 | -0.546 | 1.000 | -0.370 | -0.535 | 0.658 | -0.557 | 0.474 |
| proportion_flats | -0.627 | 0.157 | -0.558 | -0.482 | 0.393 | NaN | 0.035 | 0.442 | -0.370 | 1.000 | 0.550 | -0.633 | 0.611 | -0.451 |
| proportion_newbuilds | -0.788 | 0.210 | -0.684 | -0.656 | 0.301 | NaN | -0.014 | 0.562 | -0.535 | 0.550 | 1.000 | -0.679 | 0.822 | -0.427 |
| proportion_nonretail | 0.756 | -0.285 | 0.716 | 0.649 | -0.473 | NaN | 0.014 | -0.586 | 0.658 | -0.633 | -0.679 | 1.000 | -0.747 | 0.500 |
| public_transport_dist | -0.857 | 0.226 | -0.713 | -0.580 | 0.329 | NaN | 0.013 | 0.456 | -0.557 | 0.611 | 0.822 | -0.747 | 1.000 | -0.374 |
| school_proximity | 0.399 | -0.072 | 0.466 | 0.436 | -0.263 | NaN | 0.033 | -0.527 | 0.474 | -0.451 | -0.427 | 0.500 | -0.374 | 1.000 |
First rows
| | location_id | crime_rate | proportion_flats | proportion_nonretail | new_store | commercial_property | household_size | proportion_newbuilds | public_transport_dist | transport_availability | property_value | school_proximity | competitor_density | household_affluency | normalised_sales | county | is_test | location_id,crime_rate,proportion_flats,proportion_nonretail,new_store,commercial_property,household_size,proportion_newbuilds,public_transport_dist,transport_availability,property_value,school_proximity,competitor_density,household_affluency,county |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 464.0 | 17.600541 | 0.0 | 18.10 | no | NaN | 2.926 | 29.0 | 2.9084 | All transport options | 666.0 | 20.2 | 368.74 | 4.5325 | -0.399933 | c_40 | False | NaN |
| 1 | 504.0 | 0.603556 | 20.0 | 3.97 | no | 14.85 | 4.520 | 10.6 | 2.1398 | Average transport options | 264.0 | 13.0 | 388.37 | 1.8150 | 2.216308 | c_80 | False | NaN |
| 2 | 295.0 | 0.606810 | 0.0 | 6.20 | no | 7.70 | 2.981 | 31.9 | 3.6715 | Many transport options | 307.0 | 17.4 | 378.35 | 2.9125 | 0.166920 | c_53 | False | NaN |
| 3 | 187.0 | 0.012385 | 55.0 | 2.25 | no | 1.95 | 3.453 | 68.1 | 7.3073 | No transport options | 300.0 | 15.3 | 394.72 | 2.0575 | -0.083804 | c_65 | False | NaN |
| 4 | 193.0 | 0.016182 | 100.0 | 1.32 | no | 3.05 | 3.816 | 59.5 | 8.3248 | Average transport options | 256.0 | 15.1 | 392.90 | 0.9875 | 0.962693 | c_97 | False | NaN |
| 5 | 160.0 | 0.068659 | 0.0 | 11.93 | no | 11.15 | 3.976 | 9.0 | 2.1675 | No transport options | 273.0 | 21.0 | 396.90 | 1.4100 | 0.123316 | c_69 | False | NaN |
| 6 | 43.0 | 0.254126 | 12.5 | 7.87 | no | 8.70 | 3.377 | 5.7 | 6.3467 | Average transport options | 311.0 | 15.2 | 392.52 | 5.1125 | -0.846874 | c_22 | False | NaN |
| 7 | 278.0 | 6.581131 | 0.0 | 18.10 | no | 9.10 | 3.242 | 35.3 | 3.4242 | All transport options | 666.0 | 20.2 | 396.90 | 2.6850 | 0.025207 | c_54 | False | NaN |
| 8 | 387.0 | 17.922139 | 0.0 | 18.10 | no | 16.45 | 2.896 | 4.6 | 1.9096 | All transport options | 666.0 | 20.2 | 7.68 | 6.0975 | -1.577241 | c_51 | False | NaN |
| 9 | 98.0 | 5.437707 | 0.0 | 18.10 | no | 18.15 | 3.701 | 10.0 | 2.5975 | All transport options | 666.0 | 20.2 | 255.23 | 4.1050 | -0.694260 | c_47 | False | NaN |
Last rows
| | location_id | crime_rate | proportion_flats | proportion_nonretail | new_store | commercial_property | household_size | proportion_newbuilds | public_transport_dist | transport_availability | property_value | school_proximity | competitor_density | household_affluency | normalised_sales | county | is_test | location_id,crime_rate,proportion_flats,proportion_nonretail,new_store,commercial_property,household_size,proportion_newbuilds,public_transport_dist,transport_availability,property_value,school_proximity,competitor_density,household_affluency,county |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 323 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 227,1.5174092,0.0,19.58,no,12.75,3.066,0.0,1.7573,Average transport options,403,14.7,353.89,1.6075,c_62 |
| 324 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 114,83.093533,0.0,18.1,no,16.450000000000003,2.9570000000000007,0.0,1.8026,All transport options,666,20.2,16.45,5.155000000000001,c_22 |
| 325 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 203,10.988323399999999,0.0,18.1,no,19.5,3.4060000000000006,2.799999999999997,2.0651,All transport options,666,20.2,385.96,4.88,c_19 |
| 326 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 12,0.15989499999999998,0.0,6.91,no,4.899999999999999,3.1689999999999996,93.4,5.7209,No transport options,233,17.9,383.37,1.4525,c_63 |
| 327 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 199,9.3419925,0.0,18.1,yes,15.899999999999997,2.875,10.400000000000006,1.1296,All transport options,666,20.2,347.88,2.22,c_107 |
| 328 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 477,0.010237799999999998,90.0,2.97,no,2.500000000000002,4.087999999999999,79.2,7.3073,No transport options,285,15.3,394.72,1.9625,c_69 |
| 329 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 341,0.2168018,0.0,7.38,no,7.15,3.431,85.3,5.4159,Average transport options,287,19.6,393.68,1.27,c_58 |
| 330 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 136,5.751892099999999,0.0,18.1,no,18.15,3.2970000000000006,8.200000000000003,2.3682,All transport options,666,,385.09,4.3175,c_56 |
| 331 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 148,0.6500776999999999,0.0,6.2,no,7.850000000000001,5.337,26.700000000000003,3.8384,Many transport options,307,17.4,385.91,0.6175,c_122 |
| 332 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | 363,0.1482221,0.0,8.56,no,,3.1270000000000007,14.799999999999997,2.1224,Average transport options,384,20.9,387.69,3.5225,c_63 |
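In the rows above where is_test is True, the whole record sits as one comma-separated string in the final column and every other field is NaN. The following is a minimal sketch of how such a string might be split back into named fields, assuming the field order follows the comma-joined column name itself. `split_raw_record` is a hypothetical helper, not part of the report.

```python
import csv
import io

# Name of the unparsed column, copied from the table header.
RAW_COLUMN = (
    "location_id,crime_rate,proportion_flats,proportion_nonretail,new_store,"
    "commercial_property,household_size,proportion_newbuilds,"
    "public_transport_dist,transport_availability,property_value,"
    "school_proximity,competitor_density,household_affluency,county"
)


def split_raw_record(raw):
    """Split one unparsed CSV line into a field-name -> value mapping.

    Empty fields become None. All other values stay strings, so type
    conversion is left to the caller.
    """
    names = RAW_COLUMN.split(",")
    values = next(csv.reader(io.StringIO(raw)))
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} fields, got {len(values)}")
    return {name: (value or None) for name, value in zip(names, values)}


# Row 330 from the table; its school_proximity field is empty.
record = split_raw_record(
    "136,5.751892099999999,0.0,18.1,no,18.15,3.2970000000000006,"
    "8.200000000000003,2.3682,All transport options,666,,385.09,4.3175,c_56"
)
print(record["transport_availability"], record["school_proximity"])
```

The raw strings carry 15 fields and no normalised_sales value, so a record recovered this way would still lack that column.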